Sensitivity of complex networks measurements 
Complex networks derived from real-world data are often characterized by incompleteness
and noise, consequences of limited sampling as well as artifacts in the acquisition process.
Because the characterization, analysis and modeling of complex systems underlain by complex
networks are critically affected by the quality of the respective initial structures, it becomes
imperative to devise methodologies for identifying and quantifying the effect of such sampling
problems on the characterization of complex networks. Given that several measurements need to
be applied in order to achieve a comprehensive characterization of complex networks, it is
important to investigate the effect of incompleteness and noise on such quantifications. In this
article we report such a study, involving 8 different measurements applied to 6 different complex
network models. We evaluate the sensitivity of the measurements to perturbations in the topology
of the network by means of the relative entropy. Three particularly important types of progressive
perturbations to the network are considered: edge suppression, addition and rewiring. The
conclusions have important practical consequences, including the fact that scale-free structures
are more robust to perturbations. The measurements allowing the best balance of stability
(smaller sensitivity to perturbations) and discriminability (separation between different network
topologies) were also identified.



I. INTRODUCTION 

Complex network theory has been widely applied to
model real-world systems, such as the Internet, the World
Wide Web, protein interactions, airlines, roads, food
webs and society [1]. The success of this area is to a
great extent the consequence of two recent developments:
the increase of computational power and the availability of
several databases. In the former case, computers have allowed
the processing of networks with thousands or even millions of
vertices. In the latter, many maps of interactions, ranging
from biology to social science, have become available
since the 1990s. However, most of these maps are not
complete, and methods should be developed to characterize
these networks [1].

Sampling is a fundamental problem in complex net- 
works because the connectivity of many studied real- 
world networks may differ substantially from the origi- 
nal complex systems from which they were derived. This 
effect results in biased models, inaccurate characteriza- 
tion, or incorrect classification and modeling of complex 
systems. In addition, many dynamical processes, such
as resilience to random and targeted attacks, spreading
processes, synchronization [1], random walks [3] and
flow, are closely related to the completeness of
networks.

A variety of sampling methods can be considered to
map a complex system into a network. The sampling
issue has recently been considered in the analysis of
different cross-section approaches to construct biological,
information, technological and social networks. For
instance, the available protein-protein interactions cover
only a fraction of the complete interactome map. As a
matter of fact, the high-throughput "yeast two-hybrid" (Y2H)
assay tends to provide a high number of false positives,
i.e. interactions identified in the experiment that
never take place in the cell. Sprinzak et al. [11]
suggested that the reliability of high-throughput Y2H
is about 50%. Generally, it is assumed that the
incomplete maps can be extrapolated to the complete
interactome, so that limited sampling would not affect the
topological structure of the network. This assumption
is based on the scale-free structure of protein
interaction networks. However, the subnetworks obtained
by sampling scale-free networks are not guaranteed to
be scale-free. In addition, limited sampling can
result in scale-free structures irrespective of the original
network topology. In order to overcome these
limitations, efforts have been made to obtain more
accurate databases of protein interactions [15].

In the case of the World Wide Web, the network structure
depends strongly on the web crawler applied for
sampling each chosen domain. Different sampling strategies
can induce bias, affecting in many ways the resulting
recovered structure [17]. Indeed, some crawlers tend to
overestimate the average number of connections of pages.
A possible solution for such limitations is to start from
as large a set of pages as possible [13].

Accurate topologies of the Internet are fundamental
for routing strategies and for forecasting its growth.
Internet sampling is generally based on traceroutes: packets
are sent through the network in order to obtain the IP
addresses of the routers along the path. However, it is often
assumed that these packets follow the shortest paths in
the network [13], which implies that a large set of connections
is missed because of the possible presence of redundant
links among routers. Moreover, in the traceroute strategy
edges close to the root are more visible, i.e. the
probability of obtaining an edge decreases with
its distance from the root. It has also been observed
that the traceroute sampling of random networks leads
to networks with power-law degree distribution [19].

Social networks are also incomplete. Generally, these
networks are restricted to a special class of human activity
(e.g. music, sports, casting and collaborations in science)
or are constructed by considering human relations
(e.g. friendship and relationship). The way in which these
networks are obtained can often result in biased data,
owing to problems such as boundary specification, inaccuracy
in questionnaire application and inaccessibility of
subjects [20]. Moreover, depending on the type
of personal relationship considered, it becomes particularly hard to
define the links. It is difficult to estimate the
effects of missing data in social networks.

In the light of the above discussion, it becomes clear 
that sampling bias might induce properties not represen- 
tative of the actual complex networks, leading to incor- 
rect characterization and modeling. The sampling prob- 
lem can be tackled by considering the following three 
possible approaches: 

1. Improvement of sampling methodologies, 

2. Development of methods to predict missing [21] and
wrong links,

3. Determination of the most suitable measurements 
to characterize incomplete networks. 

The first strategy depends on the type of network
that one wants to sample. The second involves assumptions
about rules and constraints for each network structure,
such as hierarchical organization [21]. The third
has the intrinsic advantages of being applicable to all
already existing networks as well as providing the only
alternative in cases where the sampling problems cannot
be completely avoided. Some strategies have also been
developed to minimize the incomplete sampling problem
by applying remedial techniques [16]. The work reported
in the current article relates to the third of the strategies
above, by quantifying the influence of several types of
perturbations on complex network measurements. The
perturbation of the degree distribution has been
investigated before. Nevertheless, it is now realized
that a single type of measurement (e.g. degree) is not
enough to characterize the structure of networks [23]. In
addition, Alderson et al. have shown that networks
with the same degree distribution can present distinct
topologies. Therefore, a comprehensive set of measurements
must be taken into account in order to obtain an
accurate network characterization [23], which makes the
effect of structural perturbations on several complex
network measurements a particularly important
issue.

Measurements that are too sensitive to perturbations
in the network may not be adequate to characterize
incomplete or noisy networks. Moreover, measurements
that do not reflect differences between distinct network
structures are of reduced value because of the implied
lack of discriminability [23]. In this paper, we analyze the
most traditional measurements used for network
characterization by considering three important classes of
perturbations: (i) edge removal, (ii) edge inclusion, and (iii)
edge rewiring. Since these perturbations can be understood
as noise added to networks, we considered
information theory in order to quantify the
sensitivity of network measurements. More specifically,
we analyzed the distribution of measurements in terms
of relative entropies (Kullback-Leibler distance). This
quantity gives the "distance" in bits
between two probability mass functions. In this way, we
obtained the distribution of a given measurement, p, and
the distribution of the same measurement after the network
perturbation, q. The relative entropy calculated from
these two distributions quantifies how much they
differ (it is never negative, and is
zero only when the two distributions coincide). Thus, by inspecting
the behavior of the measurements under these perturbations,
we were able to identify the candidate measurements most
suitable for the analysis and characterization of networks
constructed with incomplete data or in the presence of noise.
We analyzed 8 different measurements on 6 different complex
network models.



II. BASIC CONCEPTS AND METHODOLOGY 

An undirected complex network (or graph) $G$ is defined
as $G = (V, E)$, where $V$ is the set of $N$ nodes and
$E$ is the set of $M$ undirected edges of the type $\{i, j\}$,
indicating that the nodes $i$ and $j$ are connected. An
undirected complex network without multiple edges can
be represented in terms of its adjacency matrix $A$, whose
elements $a_{ij}$ and $a_{ji}$ are equal to one whenever there is a
connection between the vertices $i$ and $j$, and equal to zero
otherwise. Since most real-world networks are composed
of thousands or even millions of vertices, the analysis of
their structure cannot be performed by visual inspection.
Instead, a set of measurements is considered in order
to describe and discriminate network topologies. These
measurements can reflect different features of the network,
such as connectivity, assortativity, centrality and
hierarchies. In this work, we considered the distribution
of the following representative set of measurements in
order to characterize the network structures [23]. The
chosen measurements include more traditional and simpler
measurements, such as node degree and clustering
coefficient, as well as more recent and sophisticated features, such
as betweenness centrality and hierarchical measurements.

• Degree: the degree of a node $i$, $k_i$, is given by its
number of connections.

• Average degree of nearest neighbors: The average
neighbor connectivity, $k_{nn}$, measures the average
degree of the neighbors of a vertex in the network.

• Clustering coefficient: The clustering coefficient of
a node $i$, $C_i$, is defined as the number of links between
the vertices within its neighborhood, $l_i$, divided
by the number of edges that could possibly
exist between them, $k_i(k_i - 1)/2$.

• Hierarchical measurements: Hierarchical measurements
are defined by considering the successive
neighborhoods around each node [23, 20]. Such
measurements are particularly interesting because
they reflect several topological scales around each
reference node, from purely local (first neighborhood)
to completely global (the most distant
neighbors). The ring of vertices $R_d(i)$ (or
hierarchical/concentric level) is formed by those vertices
at a distance of $d$ edges from the reference vertex $i$.

  - Hierarchical degree at level $d$, $hk_d(i)$, is defined
    as the number of edges connecting the
    rings $R_d(i)$ and $R_{d+1}(i)$.

  - Hierarchical clustering coefficient, $hC_d$, is
    given by the number of edges among nodes
    in the respective $d$-ring, $m_d(i)$, divided by
    the total number of possible edges between the
    vertices in that ring.

  - Divergence ratio, $hdr_d$, corresponds to the ratio
    between the number of vertices in the ring
    at level $d+1$ and the hierarchical node degree
    at level $d$.

• Shortest path length: The shortest path length between
two vertices $i$ and $j$, $\ell_{ij}$, is given by the shortest
distance between that pair of vertices.

• Betweenness centrality: The betweenness centrality
of a vertex $i$, $B_i$, quantifies the fraction of shortest
paths between each pair of nodes in the network
that pass through this vertex.
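
The local and hierarchical definitions above can be illustrated with a minimal sketch. It assumes the network is stored as a dictionary mapping each vertex to the set of its neighbors; the function names and the toy graph are illustrative choices, not part of the original study.

```python
from collections import deque

def degree(adj, i):
    """Number of connections of vertex i."""
    return len(adj[i])

def avg_neighbor_degree(adj, i):
    """Average degree of the neighbors of vertex i (0.0 for isolated vertices)."""
    if not adj[i]:
        return 0.0
    return sum(len(adj[j]) for j in adj[i]) / len(adj[i])

def clustering(adj, i):
    """Links among the neighbors of i divided by k_i (k_i - 1) / 2."""
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return links / (k * (k - 1) / 2)

def rings(adj, i):
    """Breadth-first search: map each distance d to the ring R_d(i)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    out = {}
    for v, d in dist.items():
        out.setdefault(d, set()).add(v)
    return out

def hierarchical_degree(adj, i, d):
    """Number of edges connecting the rings R_d(i) and R_{d+1}(i)."""
    r = rings(adj, i)
    inner, outer = r.get(d, set()), r.get(d + 1, set())
    return sum(1 for u in inner for v in adj[u] if v in outer)

# Toy graph: a triangle 0-1-2 with a tail 1-3-4.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1, 4}, 4: {3}}
print(degree(adj, 1), avg_neighbor_degree(adj, 0), clustering(adj, 0))
print(hierarchical_degree(adj, 0, 1))
```

With $d = 0$ the ring contains only the reference vertex, so the hierarchical degree reduces to the ordinary degree in this sketch.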



A. Perturbation methods 

In this work, the noise and incompleteness frequently 
found in complex networks derived from real-world data 
are modeled in terms of three basic types of structural 
perturbations, namely: 

• Edge removal: Edges are selected at random and
removed from the network.

• Edge addition: Two unconnected vertices are
selected at random, and a connection is established
between them.

• Edge rewiring: Two pairs of connected vertices are
chosen and their connections are interchanged.

In our analysis, we also considered a random combina- 
tion of all these types of perturbations. 

The intensity of the perturbations ranged from 0 to
10% of the total number of edges in the network. In the
case of the rewiring perturbation, the number of steps
necessary to reach 10% of the edges was half of that required
for the other two, because each step corresponded to a
change of two edges. Perturbations involving vertices
could also be considered. However, the addition of
vertices should depend on the type of network in question.
In order to make our analysis simpler and more robust,
we focused on edge perturbations. The behavior of the
measurements was therefore studied with respect to several
types of edge perturbations.



B. Relative entropy 

In statistical mechanics, the entropy is a measure of
uncertainty or disorganization in a physical system [27].
In principle, the entropy is given by the logarithm of the
number of ways in which a system can be configured. The
concept of entropy has many applications in different
research areas. For instance, in quantum mechanics
the entropy is related to the von Neumann entropy [28], while
in complexity theory it is related to the Kolmogorov entropy [29].
Here, we consider the concept of entropy in the sense
of information theory, where entropy is used to quantify
the minimum descriptive complexity of a random
variable. In this case, the entropy of a discrete random
distribution $p(x)$ is given as



$$H(p) = -\sum_{x} p(x) \log p(x), \tag{1}$$



where the logarithm is taken to base 2. In the case of
complex networks, many of their properties result from the
heterogeneity of their connections. Jun et al. [30]
suggested the consideration of the normalized entropy of the
rank distribution in order to analyze scale-free networks.
The relative entropy, or Kullback-Leibler distance,
measures the "distance" in bits between two probability
mass functions $p(x)$ and $q(x)$ and is defined as



$$D(p, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}. \tag{2}$$



Such value is always nonnegative and is zero if and
only if $p = q$. The usual conventions are $0 \log \frac{0}{0} = 0$,
$0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = \infty$.
Therefore, if $p$ is the distribution of a given
network measurement and $q$ is the distribution of the
same measurement obtained from the respective network
in the presence of noise, the relative entropy provides a
sound means to quantify the intensity of the changes implied
by the noise on the distribution $p$. In this work we
considered the relative entropy to determine the sensitivity
of different network measurements.
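
As a worked illustration of Equation (2), the following minimal sketch computes the relative entropy in bits for two probability mass functions given as equal-length lists. The function name and the example distributions are illustrative assumptions; terms with $p(x) = 0$ are skipped and terms with $q(x) = 0 < p(x)$ yield infinity, following the conventions stated above.

```python
import math

def relative_entropy(p, q):
    """Kullback-Leibler distance D(p, q) in bits between two discrete pmfs."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # convention: 0 log(0 / q) = 0
        if qx == 0:
            return math.inf   # convention: p log(p / 0) = infinity
        total += px * math.log2(px / qx)
    return total

p = [0.5, 0.5]
q = [0.25, 0.75]
print(relative_entropy(p, q))  # about 0.2075 bits
print(relative_entropy(q, p))  # a different value: the measure is not symmetric
```

The lack of symmetry is also visible in the pairwise tables of Section III, where the entry for a pair of models generally differs from the entry for the reversed pair.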



C. Analyzed Networks 

In order to study the effects of perturbations on networks,
we considered structures generated by six different
network models, including traditional structures such
as random, scale-free and geographical models as well as
more recent models such as limited and non-linear
preferential attachment. The consideration of several models is
fundamental for investigating the effect of perturbations
in networks because the implied changes are strongly
dependent on the specific network connectivity. Also, because
these models have distinct properties and allow
ensembles of networks to be generated, it becomes immediately
possible to quantify the discrimination power of the
measurements with respect to such different types of structures.



1. Theoretical models 

Since the perturbation dynamics can depend on the
network structure, we considered the following network
models.

• Erdős-Rényi random graph (ER): This model
generates networks with a random distribution of
connections. The network is constructed by connecting
each pair of vertices in the network with a fixed
probability $p$ [31]. This model generates a
Poisson-like degree distribution [32].

• Small-world model of Watts and Strogatz (WS):
To construct this type of small-world network, one
starts with a regular ring lattice of $N$ vertices in
which each vertex is connected to its $n$ nearest
neighbors in each direction. Each edge is then randomly
rewired with probability $q$ [33].

• Barabási-Albert scale-free model (BA): This model
generates networks with a power-law degree
distribution. The network is generated by starting with a
set of $m_0$ vertices and, at each time step, the
network grows with the addition of a new vertex with
$m$ links. The vertices which receive the new edges
are chosen following a linear preferential attachment
rule, i.e. the probability of the new vertex $i$
connecting to an existing vertex $j$ is proportional
to the degree of $j$, $P(i \to j) = k_j / \sum_u k_u$ [34].

• Waxman geographical model (WG): Geographical
networks can be constructed by distributing $N$
vertices at random in a 2D space and connecting
them according to their distance [33]. This model
is created by randomly distributing $N$ vertices in
a square of side $L = \sqrt{N}$ and connecting them
with probability $p \sim e^{-\lambda d}$, where $d$ is their
geographic distance and $\lambda$ is a constant adjusted to
achieve the desired average degree.

• Limited scale-free model (LSF): The network is
generated as in the BA model, but the degree of each
vertex is limited to a maximum value $k_{max}$ [36].

• Nonlinear preferential attachment network
model (NLBA): The network is constructed
as in the BA model but, instead of a linear preferential
attachment rule, the vertices are connected
following a nonlinear preferential attachment rule,
i.e. $P_{i \to j} = k_j^{\alpha} / \sum_u k_u^{\alpha}$. In this case, while for
$\alpha < 1$ the network has a stretched exponential
degree distribution, for $\alpha > 1$ a single site connects
to nearly all other sites [37].
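
Two of the generative rules above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the seed is taken to be a complete graph on $m + 1$ vertices, and the degree limit of the LSF variant is implemented by simply excluding saturated vertices from the candidate list. Neither choice is specified in the text, and the function names are hypothetical.

```python
import random

def erdos_renyi(n, p, rng):
    """Connect each pair of vertices independently with probability p."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}

def preferential_attachment(n, m, alpha, rng, k_max=None):
    """Growth with attachment probability proportional to k_j ** alpha.

    alpha = 1 gives the linear (BA-like) rule, alpha != 1 the nonlinear rule,
    and k_max (if given) caps the degree as in the limited scale-free model.
    """
    # Assumed seed: a complete graph on m + 1 vertices.
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    degree = [m] * (m + 1)
    for new in range(m + 1, n):
        candidates = [v for v in range(new)
                      if k_max is None or degree[v] < k_max]
        if len(candidates) < m:
            raise ValueError("not enough unsaturated vertices to attach to")
        weights = [degree[v] ** alpha for v in candidates]
        targets = set()
        while len(targets) < m:  # draw until m distinct targets are found
            targets.add(rng.choices(candidates, weights=weights)[0])
        for v in targets:
            edges.add((v, new))
            degree[v] += 1
        degree.append(m)
    return edges

rng = random.Random(1)
ba = preferential_attachment(200, 3, 1.0, rng)
print(len(ba), 2 * len(ba) / 200)  # edge count and average degree (close to 6)
```

With $m = 3$ the average degree approaches 6, the value used in the simulations of Section III; `alpha=0.5` and `k_max=50` would correspond to the NLBA and LSF parameters quoted there.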



III. RESULTS AND DISCUSSIONS 

Our simulations considered the following parameters:
$N = 1,000$ vertices; average degree 6; in the case of the WS
model, the probability $q$ of reconnection was 0.3; $\lambda$ was
1.0 for the WG model; $\alpha = 0.5$ for the NLBA network model;
and the maximum degree was $k_{max} = 50$ for the LSF
network model.

The perturbations were performed from 0.5% up to
10% of the total number of edges of each network, in steps
of 0.5%. Also, for each network model, 20 networks were
generated at each step. For each network, we obtained
the normalized distribution of each measurement considering
50 bins. The histograms for every network were
obtained by taking into account the same maximum and
minimum values for each measurement. In order to quantify
the variation in the distribution, we calculated the
relative entropies according to Equation (2) of Section II B.
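
The histogram step can be sketched as follows, assuming each measurement is available as a plain list of per-vertex values. The helper names are illustrative; the sketch uses a shared minimum and maximum for both histograms, as described above, and does not attempt to reproduce how empty bins were treated in the original simulations, which the text does not state.

```python
import math

def normalized_histogram(values, lo, hi, bins=50):
    """Fraction of values falling in each of `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def distribution_shift(original, perturbed, bins=50):
    """Relative entropy (bits) between the histograms of two value lists."""
    lo = min(min(original), min(perturbed))
    hi = max(max(original), max(perturbed))
    p = normalized_histogram(original, lo, hi, bins)
    q = normalized_histogram(perturbed, lo, hi, bins)
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue
        if qx == 0:
            return math.inf  # empty bin in q where p has mass
        total += px * math.log2(px / qx)
    return total

degrees_before = [1, 1, 2, 2]
degrees_after = [1, 2, 2, 2]
print(distribution_shift(degrees_before, degrees_after, bins=2))
```

Using the same bin edges for both lists matters: otherwise the two histograms would not be defined over the same support and the sum in Equation (2) would compare unrelated bins.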

Figure 1 presents the results with respect to each of
the 6 network models, and Table I shows the average of
the relative entropy over all network models considering
10% of edge perturbations. The main results observed
are discussed as follows:

• Random edge removal causes smaller variations in
the measurements than the other three types of
perturbations (including the random combination
of all perturbations). From this result, we can
conclude that it is better not to include edges about
which we are uncertain, as the inclusion of a
nonexistent edge implies larger deviations of the
measurements than the removal of an existing one.

• Comparing the results for all networks, for
each perturbation the measurements can be
ordered as follows, from the least to the most sensitive,
according to the values of the
maximum entropy:

  - Edge addition:
    $B$, $C$, $k$, $hdr_2$, $hC_2$, $k_{nn}$, $hk_2$, $\ell$.

  - Edge rewiring:
    $k$, $k_{nn}$, $C$, $hC_2$, $B$, $hdr_2$, $hk_2$, $\ell$.

  - Edge removal:
    $hC_2$, $B$, $C$, $hdr_2$, $k$, $\ell$, $k_{nn}$, $hk_2$.

  - All three perturbations together:
    $k$, $hC_2$, $k_{nn}$, $C$, $hdr_2$, $B$, $hk_2$, $\ell$.

Depending on the perturbation, the measurements
can be more or less sensitive. Therefore, it is important
to select the appropriate set of measurements
according to the type of perturbation, since particularly
sensitive measurements can lead to wrong
network characterization.

TABLE I: Average variation of entropy for each plot in Figure 1
considering the maximum perturbation with respect to
edge addition, edge rewiring, edge removal and all these
perturbations applied together.

Meas.           Addition   Rewiring   Removing   Altogether
k               0.0269     0.0000     0.0418     0.0006
knn             0.1207     0.0003     0.0514     0.0010
hk2             0.3216     0.0605     0.2060     0.0412
C               0.0127     0.0197     0.0050     0.0122
hC2             0.0354     0.0245     0.0003     0.0008
hdr2            0.0333     0.0310     0.0353     0.0188
ℓ               0.8218     0.4992     0.0510     0.4771
B               0.0031     0.0268     0.0031     0.0215
Average         0.1719     0.0828     0.0493     0.0717
St. deviation   0.2828     0.1693     0.0669     0.1644

• The shortest path length is the most sensitive network
measurement, especially for the geographical
WS and WG models. This result was expected because
just a few rewirings in a regular network can
lead to a small-world network [33]. In other words,
adding or rewiring edges can connect vertices which
are far away, reducing the average shortest path
length. Therefore, this measurement is not particularly
suitable for networks exhibiting geographical
organization.

• Among the network models, the scale-free structures
were the least sensitive, with the LSF and NLBA
models being the most
robust. Indeed, these network models generate
topologies which are closer to real-world networks
than the BA model. For instance, the BA
model tends to generate networks whose average
clustering coefficient is smaller than that observed
in real-world networks. On the other hand, the LSF
and NLBA models can overcome this limitation through
an appropriate choice of parameters. The fact that
scale-free networks are less sensitive to perturbation
dynamics is a fundamental finding because most
real-world networks are scale-free. Indeed, scale-free
networks have been previously observed to be highly
resilient against random failures, although just
the average shortest path length was investigated
in that work. On the other hand, the WG and WS
models present the structures most sensitive to the
considered perturbations.

In order to quantify the discriminability of each measurement,
we again resorted to the relative entropy,
this time to obtain the "distance" between pairs of different
types of models in the absence of perturbations. Thus,
high values of relative entropy suggest good separability
between models. Tables II to IX present the pairwise
comparison of the six models for each of the eight measurements.
For instance, in the case of the degree distribution
(see Table II), the relative entropy between the
BA and LSF network models is the smallest one, followed
by the relative entropy between the ER and WG models.
The betweenness centrality is the measurement that
provided the poorest separation between models. This
is mainly due to the lack of community structure in the
considered network models.

TABLE II: Relative entropy of degree distribution, where
ER is the Erdős and Rényi random model, WS is the Watts
and Strogatz small-world model, BA is the Barabási and Albert
scale-free model, WG is the Waxman geographic model,
NLBA is the non-linear preferential attachment model, and
LSF is the limited scale-free model.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   0.502   0.424   0.005   0.306   0.504
WS      0.468   0.000   0.957   0.527   1.125   1.366
BA      0.377   1.767   0.000   0.325   0.037   0.001
WG      0.005   0.548   0.396   0.000   0.270   0.467
NLBA    0.527   1.708   0.047   0.544   0.000   0.040
LSF     0.704   2.056   0.022   0.719   0.082   0.000

Considering the average values of Tables II to IX, the
measurements are presented in Table X in decreasing order of
discrimination (highest values of entropy first).
It is interesting to note that the degree distribution, which is
widely used to characterize complex network models,
did not perform particularly well in our analysis.
For instance, while the relative entropy between the ER
and BA degree distributions is 0.424, for the hierarchical
clustering coefficient of level 2 it is 5.581. Indeed, the
hierarchical measurements accounted for the best overall
characterization of network structures with respect to
discriminability. This property is possibly related to the fact
that the hierarchical measurements take into account a
larger portion of the original network, therefore providing
a more comprehensive quantification of the local topology.

The main motivation of our study of perturbations
in networks was to find measurements allowing an acceptable
compromise between stability and discriminability.
A proper measurement to characterize sampled
networks should therefore provide good
characterization of network structure and small sensitivity to
perturbations. Figure 2 shows scatterplots defined by
the sensitivity with respect to edge addition (Fig. 2a),
rewiring (Fig. 2b), removal (Fig. 2c) and joint
perturbations (Fig. 2d) versus the discriminability, considering
the average relative entropy in each respective case.
The best measurements are those falling in the lower
right-hand portions of these scatterplots. Therefore, the
measurements allowing the overall best combinations of
sensitivity and discriminability include $hC_2$, $hk_2$, $C$
and $hdr_2$. Interestingly, the node degree, betweenness and
shortest path length, which have been frequently and intensively used
for network characterization, are either too sensitive or
not discriminative.
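
The trade-off can be made concrete with the numbers already reported: the "Altogether" column of Table I as sensitivity and the averages of Table X as discriminability. The sketch below is only an illustration; the two thresholds are arbitrary values chosen here to mimic the "lower right-hand" region of the scatterplots and are not taken from the original analysis.

```python
# Sensitivity: "Altogether" column of Table I (lower is more stable).
sensitivity = {"k": 0.0006, "knn": 0.0010, "hk2": 0.0412, "C": 0.0122,
               "hC2": 0.0008, "hdr2": 0.0188, "l": 0.4771, "B": 0.0215}
# Discriminability: average relative entropy from Table X (higher is better).
discriminability = {"hC2": 2.69, "hk2": 1.92, "C": 1.55, "hdr2": 1.51,
                    "knn": 1.19, "l": 1.11, "k": 0.56, "B": 0.29}

# Hypothetical cut-offs, for illustration only.
MAX_SENSITIVITY = 0.05
MIN_DISCRIMINABILITY = 1.5

selected = sorted(
    (m for m in sensitivity
     if sensitivity[m] <= MAX_SENSITIVITY
     and discriminability[m] >= MIN_DISCRIMINABILITY),
    key=lambda m: -discriminability[m],
)
print(selected)
```

With these particular cut-offs the selection coincides with the four measurements singled out in the text; other thresholds, or the sensitivity to a single perturbation type, would give a different picture.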



TABLE III: Relative entropy of average degree of nearest
neighbors distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   1.454   1.688   0.102   0.711   1.563
WS      0.816   0.000   2.197   1.021   1.760   2.659
BA      1.854   0.814   0.000   1.786   1.011   0.162
WG      0.137   1.780   1.592   0.000   0.549   1.373
NLBA    1.312   1.677   0.627   1.372   0.000   0.387
LSF     1.907   0.375   0.121   2.291   0.638   0.000



TABLE IX: Relative entropy of vertex betweenness centrality
distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   0.138   0.076   0.250   0.317   0.556
WS      0.111   0.000   0.114   0.366   0.697   1.054
BA      0.039   0.067   0.000   0.215   0.018   0.004
WG      0.390   0.398   0.329   0.000   0.244   0.433
NLBA    0.286   0.656   0.025   0.204   0.000   0.046
LSF     0.435   0.892   0.010   0.342   0.053   0.000



TABLE IV: Relative entropy of hierarchical degree of level 2
distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   2.621   2.272   2.386   0.562   1.597
WS      1.949   0.000   6.353   0.276   3.158   5.447
BA      1.353   0.374   0.000   1.271   0.741   0.088
WG      1.655   0.428   6.319   0.000   3.212   5.406
NLBA    0.739   1.763   0.786   2.064   0.000   0.393
LSF     1.115   1.034   0.091   1.933   0.360   0.000



TABLE V: Relative entropy of clustering coefficient distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   3.226   0.113   1.748   0.017   0.074
WS      3.776   0.000   3.659   0.393   5.361   3.979
BA      0.256   2.614   0.000   1.297   0.085   0.009
WG      3.009   0.399   1.954   0.000   3.095   2.328
NLBA    0.029   3.052   0.056   1.616   0.000   0.028
LSF     0.147   2.738   0.008   1.385   0.037   0.000



TABLE VI: Relative entropy of hierarchical clustering coefficient
of level 2 distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   5.581   2.149   5.311   0.872   1.976
WS      5.741   0.000   1.955   1.674   2.630   1.879
BA      2.054   2.143   0.000   4.148   0.185   0.050
WG      7.366   1.642   4.859   0.000   6.107   5.262
NLBA    1.196   2.738   0.249   4.446   0.000   0.227
LSF     2.371   1.794   0.052   3.991   0.200   0.000



TABLE VII: Relative entropy of hierarchical divergence ratio
of level 2 distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   2.701   0.363   2.922   0.118   0.280
WS      4.849   0.000   1.948   0.588   2.197   1.785
BA      0.484   1.564   0.000   2.514   0.130   0.027
WG      2.880   0.766   3.269   0.000   3.110   2.602
NLBA    0.248   2.350   0.137   2.324   0.000   0.071
LSF     0.509   2.137   0.032   2.227   0.065   0.000



TABLE VIII: Relative entropy of shortest path length distribution.

        ER      WS      BA      WG      NLBA    LSF
ER      0.000   0.377   0.620   1.743   0.092   0.397
WS      0.503   0.000   2.811   1.339   1.254   2.269
BA      0.403   1.258   0.000   2.288   0.167   0.021
WG      4.647   1.863   1.093   0.000   2.797   0.841
NLBA    0.068   0.683   0.214   1.952   0.000   0.096
LSF     0.266   1.064   0.022   2.184   0.079   0.000



TABLE X: Average and standard deviation among the values
present in Tables II to IX.

Measurement   Av. and std
hC2           2.69 (2.08)
hk2           1.92 (1.81)
C             1.55 (1.57)
hdr2          1.51 (1.31)
knn           1.19 (0.71)
ℓ             1.11 (1.10)
k             0.56 (0.55)
B             0.29 (0.27)
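
As a consistency check, the $hC_2$ entry of Table X can be recomputed from the 30 off-diagonal values of Table VI. The sketch below assumes that the diagonal zeros are excluded and that the sample standard deviation is the one reported; both are inferences from the numbers rather than statements made in the text.

```python
from statistics import mean, stdev

# Table VI, rows ER, WS, BA, WG, NLBA, LSF (columns in the same order).
table_vi = [
    [0.000, 5.581, 2.149, 5.311, 0.872, 1.976],
    [5.741, 0.000, 1.955, 1.674, 2.630, 1.879],
    [2.054, 2.143, 0.000, 4.148, 0.185, 0.050],
    [7.366, 1.642, 4.859, 0.000, 6.107, 5.262],
    [1.196, 2.738, 0.249, 4.446, 0.000, 0.227],
    [2.371, 1.794, 0.052, 3.991, 0.200, 0.000],
]

# Keep only comparisons between different models (off-diagonal entries).
off_diagonal = [value
                for r, row in enumerate(table_vi)
                for c, value in enumerate(row)
                if r != c]

print(round(mean(off_diagonal), 2), round(stdev(off_diagonal), 2))
```

Under these assumptions the result agrees with the 2.69 (2.08) listed for $hC_2$.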



IV. CONCLUSIONS 

Much of the success of complex network research has
relied on the accurate modeling of complex phenomena.
To reach this goal, efforts should be concentrated on
developing methods able to obtain databases and measurements
that can characterize network structures with
accuracy. Thus, the development of improved sampling
techniques and the analysis of the behavior of measurements
with respect to incomplete networks or networks with
biased connections are fundamental for complex network
research. In this paper, we reported an analysis of network
measurements with respect to progressively
perturbed networks. The perturbations were performed at
the edge level, considering random removal, addition and
rewiring. We applied the relative entropy in order to
quantify the robustness of the network measurements
considering six representative network models. The four
measurements most suitable for analyzing perturbed
networks were identified as the hierarchical clustering
coefficient ($hC_2$), hierarchical degree ($hk_2$), clustering
coefficient ($C$) and divergence ratio ($hdr_2$). It is interesting
to note that the node degree did not emerge as the best
network measurement, being associated with poor
discrimination between networks with distinct structures.
For instance, while the relative entropy between the ER
and WG models is just 0.005 when the degree distribution
is considered, it increases to 5.741 when the hierarchical
clustering coefficient is used instead. Among the network
models, structures with scale-free organization presented
the highest robustness when subjected to perturbations.
As future work, we suggest the consideration of other
complex network measurements as well as other types
of perturbations, such as node removal or perturbations
following preferential rules. The consideration of multivariate
statistical methods (e.g. MANOVA [38]) and data mining
techniques can also help to complement the perturbation
and discrimination analysis.
FIG. 1: Measurement entropies for the considered network models, each drawn with its own marker:
Erdős and Rényi's random graph, Watts and Strogatz's small-world model, Barabási and Albert's
scale-free model, Waxman's geographic model, Krapivsky's non-linear preferential attachment
model, and Amaral et al.'s limited scale-free model.
FIG. 2: Discrimination versus sensitivity of network measurements for the different types of
perturbations, considering the values of Tables II to IX and Table I.